In [47]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import matplotlib.patches as mpatches
import seaborn as sns
from plotly import graph_objects as go
import plotly.express as px
from datetime import datetime
import math
import scipy
from scipy.stats import norm
from scipy import stats as st
In [164]:
import sys 
import warnings
if not sys.warnoptions:
    warnings.simplefilter('ignore')

Online Store Conversion: AB Testing Analysis

This analysis covers user behavior at an online retailer.

The relevant data comprise three datasets: one that assigns participants to AB test groups, one that records the events users engaged in on the website, and one with enrollment information for each participant.

The AB test assesses whether users improve their conversion rates within 14 days of their first visit to the site. A is the control group; B is the test group, made up of users exposed to an improved recommendation system.

After initial analysis of the data, there are analyses of user activity and activity by date, as well as conversion analysis and statistical analysis of the AB test.

Preprocessing

In [111]:
participants=pd.read_csv('final_ab_participants_us.csv')
In [3]:
participants.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18268 entries, 0 to 18267
Data columns (total 3 columns):
user_id    18268 non-null object
group      18268 non-null object
ab_test    18268 non-null object
dtypes: object(3)
memory usage: 428.3+ KB
In [4]:
participants.describe()
Out[4]:
user_id group ab_test
count 18268 18268 18268
unique 16666 2 2
top 023D4A062F40F10B A interface_eu_test
freq 2 9655 11567

The AB test for analysis concerns only the Interface EU Test, for which both enrollment and the test were completed, so all preprocessing and analysis will focus on the relevant rows of the dataset.

In [112]:
#creating df for ab test participants
eurtest=participants[participants.ab_test=='interface_eu_test']
In [6]:
eurtest.shape
Out[6]:
(11567, 3)
In [7]:
eurtest.describe()
Out[7]:
user_id group ab_test
count 11567 11567 11567
unique 11567 2 1
top A02AC18503A789EA A interface_eu_test
freq 1 5831 11567
In [8]:
eurtest.sample()
Out[8]:
user_id group ab_test
16477 9580F21B058C7721 B interface_eu_test
In [9]:
eurtest.user_id.nunique()
Out[9]:
11567
In [10]:
#note: only the last expression in a cell is echoed; the isna() check must be run on its own to display (it is also empty)
eurtest[eurtest.isna().any(axis=1)]
eurtest[eurtest.duplicated()]
Out[10]:
user_id group ab_test

AB Grouping Dataset Precis

There are no nulls nor duplicates in the EU test dataset.

The testing groups are not equal: group A has 95 more members than group B.

That difference represents less than 1 percent of all testing participants, so the analysis and testing will continue with it in place.

Unequal groupings in this case will not increase false positives, but the slightly smaller group marginally reduces statistical power, raising the risk of Type II errors (failing to reject a false null hypothesis).
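
The imbalance is quick to quantify. A minimal sketch, using the group sizes from the describe() output above rather than reloading the dataframe:

```python
# Group sizes from eurtest.describe(): A has 5831 members, so B has 11567 - 5831
n_a, n_b = 5831, 5736

diff = n_a - n_b                 # absolute imbalance between the groups
share = diff / (n_a + n_b)       # imbalance as a share of all participants

print(f"A exceeds B by {diff} users ({share:.2%} of all participants)")
```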

In [113]:
events=pd.read_csv('final_ab_events_us.csv.csv',parse_dates=['event_dt'])
In [12]:
events.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 440317 entries, 0 to 440316
Data columns (total 4 columns):
user_id       440317 non-null object
event_dt      440317 non-null datetime64[ns]
event_name    440317 non-null object
details       62740 non-null float64
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 64.8 MB
In [13]:
events[events.duplicated()]
Out[13]:
user_id event_dt event_name details
In [14]:
events.isnull().sum()
Out[14]:
user_id            0
event_dt           0
event_name         0
details       377577
dtype: int64
In [15]:
events.user_id.nunique()
Out[15]:
58703

Events Dataset Precis

The events dataset contains nearly four times as many unique users (58,703) as there are unique participants in the AB testing dataset (16,666).

The nulls in the details column reflect actions with no dollar amount attached. Since roughly six out of seven events carry no dollar figure, the volume of nulls is logical and they will remain as is.

In [114]:
#combining user test data with event df
eurtestevent=pd.merge(eurtest, events, how='left')
In [17]:
eurtestevent.isnull().sum()
Out[17]:
user_id           0
group             0
ab_test           0
event_dt        717
event_name      717
details       70559
dtype: int64
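
The 717 participants with no events, surfaced above via the nulls a left merge produces, can also be found with pandas' merge `indicator` flag. A minimal sketch with made-up frames (the names here are illustrative, not from the datasets):

```python
import pandas as pd

# toy stand-ins for the participants and events frames
left = pd.DataFrame({'user_id': ['u1', 'u2', 'u3']})
right = pd.DataFrame({'user_id': ['u1', 'u1', 'u3'],
                      'event_name': ['login', 'purchase', 'login']})

merged = pd.merge(left, right, how='left', indicator=True)
# rows marked 'left_only' are participants with no recorded events
no_events = merged[merged['_merge'] == 'left_only']
print(no_events['user_id'].tolist())  # → ['u2']
```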
In [115]:
eurtestevent=eurtestevent[eurtestevent['event_name'].notnull()]
In [19]:
eurtestevent.user_id.nunique()
Out[19]:
10850
In [20]:
eurtestevent[eurtestevent.duplicated()]
Out[20]:
user_id group ab_test event_dt event_name details
In [21]:
eurtestevent.groupby('user_id')['event_name'].unique().count()
Out[21]:
10850
In [22]:
eurtestevent.event_name.value_counts()
Out[22]:
login           35145
product_page    23095
purchase        12085
product_cart    11602
Name: event_name, dtype: int64

Merged Events and User Test Data

After merging the user test dataset with the event dataset, it is clear that many users engaged in multiple events each.

Purchase events outnumber product_cart events, so there is clearly another path to purchase on the site that bypasses the cart, such as a Buy It Now button.

Although login isn't required for any of the other actions, it still accounts for most site events, with product page views the second most frequent event.

In [116]:
userstart=pd.read_csv('final_ab_new_users_us.csv',parse_dates=['first_date'])
In [24]:
userstart.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 61733 entries, 0 to 61732
Data columns (total 4 columns):
user_id       61733 non-null object
first_date    61733 non-null datetime64[ns]
region        61733 non-null object
device        61733 non-null object
dtypes: datetime64[ns](1), object(3)
memory usage: 1.9+ MB
In [25]:
userstart.sample()
Out[25]:
user_id first_date region device
51551 9D88CCC9ADB8EB49 2020-12-19 N.America PC
In [26]:
#note: only the duplicate check below is echoed; isnull().sum() must be run on its own to display (it shows no nulls)
userstart.isnull().sum()
userstart[userstart.duplicated()]
Out[26]:
user_id first_date region device
In [27]:
userstart.user_id.nunique()
Out[27]:
61733
In [28]:
userstart.first_date.max()
Out[28]:
Timestamp('2020-12-23 00:00:00')

Enrollment Dataset Precis

There are roughly six times as many enrolled users as unique users who engaged in events.

Enrollment spans more than two weeks after the launch date. That is not a particularly effective approach for an AB test meant to assess behavior over a two-week period in December, since many of those enrolled will not have been enrolled for a full two weeks before the data end.

On the up side, user activity likely peaks when interest is highest, which is when users invest the time to log in or view product pages, as most of the events reflect. This analysis will explore how active users are as a function of days elapsed since enrollment.
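
The windowing concern can be made concrete: for each enrollment date, count the observation days available before the data end. A sketch with toy dates, assuming the last event date (2020-12-30 in this data) caps the window:

```python
import pandas as pd

# toy enrollment dates and an assumed data-end date
first_dates = pd.to_datetime(pd.Series(['2020-12-07', '2020-12-14',
                                        '2020-12-21', '2020-12-23']))
data_end = pd.Timestamp('2020-12-30')

days_observed = (data_end - first_dates).dt.days
has_full_window = days_observed >= 14   # did this user get a full 14-day window?
print(days_observed.tolist(), has_full_window.tolist())
```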

In [117]:
#creating a df to help understand whether device usage on the site differs among test groups
devicecheck=pd.merge(eurtest,userstart)
In [118]:
#starting to create a df for funnel and churn analysis by paring down features 
userdateregion=userstart.drop(columns='device')
In [119]:
#selecting EU users for EU AB Test df for analysis
userdateregioneu=userdateregion[userdateregion.region=='EU']
In [120]:
#merging EU user start dates with EU AB test and event df
eurtestevent=pd.merge(eurtestevent, userdateregioneu, how='left')
In [33]:
eurtestevent.isnull().sum()
Out[33]:
user_id           0
group             0
ab_test           0
event_dt          0
event_name        0
details       69842
first_date        0
region            0
dtype: int64
In [34]:
eurtestevent[eurtestevent.duplicated()]
Out[34]:
user_id group ab_test event_dt event_name details first_date region
In [35]:
eurtestevent.user_id.nunique()
Out[35]:
10850
In [36]:
eurtestevent.ab_test.unique()
Out[36]:
array(['interface_eu_test'], dtype=object)
In [37]:
eurtestevent.region.unique()
Out[37]:
array(['EU'], dtype=object)
In [121]:
#dropping columns with all the same value that is known
eurtestevent=eurtestevent.drop(columns=['region','ab_test'])
In [39]:
eurtestevent.sample()
Out[39]:
user_id group event_dt event_name details first_date
65483 6492E424A2BC052D B 2020-12-18 09:02:03 login NaN 2020-12-16
In [40]:
#note: only the last expression in a cell is echoed, so the output below is event_dt.min()
eurtestevent.first_date.max()
eurtestevent.event_dt.max()
eurtestevent.first_date.min()
eurtestevent.event_dt.min()
Out[40]:
Timestamp('2020-12-07 00:02:48')
In [41]:
eurtestevent.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 81927 entries, 0 to 81926
Data columns (total 6 columns):
user_id       81927 non-null object
group         81927 non-null object
event_dt      81927 non-null datetime64[ns]
event_name    81927 non-null object
details       12085 non-null float64
first_date    81927 non-null datetime64[ns]
dtypes: datetime64[ns](2), float64(1), object(3)
memory usage: 4.4+ MB

Reading Datasets Summary

The AB tests concern only the EU region and users in the two test groups, so it was important to build a dataset focused on the factors relevant to the analysis. The combined dataset has nulls in the details column for most records, reflecting the lack of payment details for most events, but otherwise contains no nulls and no duplicates.

The user_id column remains a string because the ids contain letters. The two date columns were assigned datetime format. Group and event names remain strings that reflect their actual content. details remains a float because it is mostly null and its values are dollar figures with cents.

A comparison of dates confirmed that the first event time falls after the earliest enrollment date and that the last event date is a week after the last enrollment date. That still does not give everyone who enrolled two weeks to test their conversion.

All four events for both test groups have a healthy share of user activity to analyze conversion and provide a base for proportion hypothesis testing.

Exploratory Data Analysis

Many aspects of the data will be considered at this stage to ensure the integrity of the data being tested.

Checking if any A group users are in B group

In [81]:
df_a=eurtestevent[eurtestevent.group=='A']
df_b=eurtestevent[eurtestevent.group=='B']
In [82]:
#unique user_ids per group (these are Series, so the former .columns assignments were no-ops and are dropped)
bgroupby=df_b.groupby('user_id')['user_id'].unique()
agroupby=df_a.groupby('user_id')['user_id'].unique()
In [83]:
bgrouplist=bgroupby.to_list()
In [45]:
len(bgrouplist)
Out[45]:
5383
In [84]:
agrouplist=agroupby.to_list()
In [47]:
len(agrouplist)
Out[47]:
5467
In [48]:
templist = [x for x in agrouplist if x not in bgrouplist]
In [49]:
len(templist)
Out[49]:
5467
In [50]:
templistb = [x for x in bgrouplist if x not in agrouplist]
In [51]:
len(templistb)
Out[51]:
5383

Summary of Checking for Users of Each Group

The above check found that no user appears in both groups: every A-group user is absent from the B list, and vice versa.
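
The same membership check is more direct, and far faster on large lists, with sets. A sketch using toy ids standing in for agrouplist and bgrouplist:

```python
# toy id lists standing in for the A and B group user lists
a_ids = ['u1', 'u2', 'u3']
b_ids = ['u4', 'u5']

# users present in both groups; an empty set means no contamination
overlap = set(a_ids) & set(b_ids)
print(len(overlap))  # → 0
```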

Checking how device usage on the site differs among test groups

In [85]:
devicecheck_b=devicecheck[devicecheck.group=='B']
devicecheck_a=devicecheck[devicecheck.group=='A']
In [53]:
devicecheck_a.device.unique()
Out[53]:
array(['Android', 'PC', 'iPhone', 'Mac'], dtype=object)
In [54]:
devicecheck_a.device.value_counts()
Out[54]:
Android    2580
PC         1511
iPhone     1147
Mac         593
Name: device, dtype: int64
In [86]:
device_alist=devicecheck_a.device.value_counts().to_list()
device_blist=devicecheck_b.device.value_counts().to_list()
In [87]:
#note: this pairing assumes both value_counts() lists share the same device order; dividing aligned Series is safer
b_deviceshareof_a = [b/a for b,a in zip(device_blist, device_alist)]
b_deviceshareof_a
Out[87]:
[1.0046511627906978,
 0.9497021839841164,
 0.9546643417611159,
 1.0354131534569984]
In [19]:
trace1 = go.Bar(
    x = devicecheck_b.device.value_counts().index,  #value_counts order, so bars align with the y counts
    y = devicecheck_b.device.value_counts(),
    name='B',
    text = device_blist,
    textposition='outside',
    marker=dict(
        color='sandybrown')
)
trace2 = go.Bar(
    x = devicecheck_a.device.value_counts().index,
    y = devicecheck_a.device.value_counts(),
    name='A',
    text = device_alist,
    textposition='outside',
    marker=dict(
        color='skyblue')
)

data = [trace1, trace2]
layout = go.Layout(barmode='group',height=500,
                   title='Test Group Users by Device',xaxis_title="Device", yaxis_title="Users")

fig = go.Figure(data=data, layout=layout)
fig.show()

Summary of how device usage on the site differs among test groups

Group B users exceed Group A users on Macs and Android devices, while Group A users exceed Group B users on PCs and iPhones.

The margins are small either way: B exceeds A by under 4% on Androids and Macs, while B users amount to roughly 95% of A users on PCs and iPhones.

The balance of users across devices is close enough that it should not significantly degrade the integrity of the test.
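
The "close enough" judgment can be formalized with a chi-square test of independence between group and device. A sketch using the group A counts printed above; the group B counts are reconstructed from the B/A ratios, so treat them as illustrative:

```python
from scipy.stats import chi2_contingency

# device counts per group: [Android, PC, iPhone, Mac]
# A row is from value_counts above; B row is reconstructed from the B/A ratios
counts = [[2580, 1511, 1147, 593],   # group A
          [2592, 1435, 1095, 614]]   # group B

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.3f}")
# a large p-value means no evidence the device mix differs between groups
```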

In [122]:
eurtestevent[eurtestevent.event_name=='purchase'].head()
spend_sum=eurtestevent[eurtestevent.event_name=='purchase'].groupby('user_id')['details'].sum()
eurtestevent[eurtestevent.event_name=='purchase'].groupby('user_id')['details'].sum().describe()
Out[122]:
count    3713.000000
mean       74.229346
std       130.200236
min         4.990000
25%        14.970000
50%        19.970000
75%       104.980000
max      1109.950000
Name: details, dtype: float64
In [89]:
#note: distplot is deprecated in recent seaborn; sns.histplot(..., kde=True) is the modern equivalent
ax=sns.distplot(spend_sum,color='skyblue',fit=norm,kde_kws={
    'color':"#3498DB",'lw':1.5,"label":'KDE'},hist_kws={'histtype':'stepfilled','lw':1,'alpha':.8},bins=40)
ax.set_title('Probability Density Distribution of Purchases for each User')
ax.set_ylabel('probability')
ax.set_xlabel('dollar expenditure');

Purchase per user

Half of all users who make purchases on the site spend less than 20 dollars.

The top spender, at roughly \$1,110, spent more than 200 times as much as the lowest spender at about five dollars. That high spender, and the entire top quartile spending above roughly \$105, skew the distribution of spending on this site far to the right.
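
The rightward skew shows up in the gap between the mean (~\$74) and the median (~\$20). A small numeric sketch, with toy spends, of how a few large values drag the mean while leaving the median put:

```python
import numpy as np

# toy spends: many small purchases plus one big spender
spends = np.array([5, 15, 15, 20, 20, 25, 100, 1100], dtype=float)

print(np.median(spends))  # → 20.0, stable against the outlier
print(np.mean(spends))    # → 162.5, pulled far right by the 1100 purchase
```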

In [90]:
print("Average number of events per user:",eurtestevent.event_name.count()/eurtestevent.user_id.nunique())
Average number of events per user: 7.274562211981567
In [123]:
eventsbyuser=eurtestevent.groupby(['user_id'])['event_name'].count().reset_index(
).sort_values(by='event_name',ascending=False)
In [92]:
#Distribution of the number of events by individual users
eventsbyuser.plot(kind='hist',ec='black',title='Events by User Distribution')
eventsbyuser.describe()
Out[92]:
event_name
count 10850.000000
mean 7.274562
std 4.158544
min 1.000000
25% 4.000000
50% 6.000000
75% 9.000000
max 32.000000

Events by User Distribution

A quarter of users perform four or fewer activities on the site, and half perform at most six.

On average, users perform between seven and eight activities, an average pulled upward by the upper half: the third quartile does six to nine activities each and the top quartile nine to 32, mostly 10 to 20.

The distribution of user activity is right skewed: the top quarter of users perform more activities than the majority, who do fewer than 10 each.

In [93]:
eventnames=eurtestevent.event_name.unique()
eventnames
Out[93]:
array(['product_cart', 'login', 'product_page', 'purchase'], dtype=object)
In [64]:
for i in eventnames:
    print("Number of events for",i,":",eurtestevent[eurtestevent.event_name==i].shape[0])
Number of events for product_cart : 11602
Number of events for login : 35145
Number of events for product_page : 23095
Number of events for purchase : 12085
In [65]:
for i in eventnames:
    print("Number of events for A group",i,":",df_a[df_a.event_name==i].shape[0])
Number of events for A group product_cart : 5766
Number of events for A group login : 17754
Number of events for A group product_page : 11750
Number of events for A group purchase : 6300
In [66]:
for i in eventnames:
    print("Number of events for B group",i,":",df_b[df_b.event_name==i].shape[0])
Number of events for B group product_cart : 5836
Number of events for B group login : 17391
Number of events for B group product_page : 11345
Number of events for B group purchase : 5785
In [67]:
for i in eventnames:
    print("Average number of times each A group user did event",i,round(df_a[
            df_a.event_name==i].shape[0]/df_a.user_id.nunique(),2),"while users in B group did it",round(df_b[
            df_b.event_name==i].shape[0]/df_b.user_id.nunique(),2))
Average number of times each A group user did event product_cart 1.05 while users in B group did it 1.08
Average number of times each A group user did event login 3.25 while users in B group did it 3.23
Average number of times each A group user did event product_page 2.15 while users in B group did it 2.11
Average number of times each A group user did event purchase 1.15 while users in B group did it 1.07

Activity by Date

In [68]:
eurtestevent.first_date.describe()
Out[68]:
count                   81927
unique                     17
top       2020-12-14 00:00:00
freq                     7924
first     2020-12-07 00:00:00
last      2020-12-23 00:00:00
Name: first_date, dtype: object
In [124]:
usersbystartdate=eurtestevent.groupby(eurtestevent['first_date'].dt.date)['user_id'].nunique()
userstartdatedf=usersbystartdate.to_frame().reset_index()
userstartdatedf.columns=['start_date','users']
userstartdatedf
Out[124]:
start_date users
0 2020-12-07 957
1 2020-12-08 551
2 2020-12-09 401
3 2020-12-10 483
4 2020-12-11 413
5 2020-12-12 723
6 2020-12-13 753
7 2020-12-14 994
8 2020-12-15 552
9 2020-12-16 383
10 2020-12-17 557
11 2020-12-18 595
12 2020-12-19 595
13 2020-12-20 759
14 2020-12-21 1132
15 2020-12-22 587
16 2020-12-23 415
In [147]:
sortedstart=userstartdatedf.sort_values(by='start_date',ascending=False)
In [126]:
#unique users per date
usersbydate=eurtestevent.groupby(eurtestevent['event_dt'].dt.date)['user_id'].nunique()
userdatedf=usersbydate.to_frame().reset_index()
userdatedf.columns=['date','users']
userdatedf
Out[126]:
date users
0 2020-12-07 920
1 2020-12-08 1007
2 2020-12-09 1002
3 2020-12-10 1076
4 2020-12-11 1114
5 2020-12-12 1486
6 2020-12-13 1690
7 2020-12-14 2068
8 2020-12-15 1875
9 2020-12-16 1716
10 2020-12-17 1774
11 2020-12-18 1838
12 2020-12-19 1865
13 2020-12-20 2076
14 2020-12-21 2586
15 2020-12-22 2298
16 2020-12-23 2097
17 2020-12-24 1592
18 2020-12-26 1126
19 2020-12-27 1016
20 2020-12-28 908
21 2020-12-29 760
22 2020-12-30 6
In [165]:
sorteduserdatedf=userdatedf.sort_values(by='date',ascending=False)
In [139]:
#mean and median excluding the final, partial day (2020-12-30 logged only 6 users)
print(np.mean(userdatedf.users.head(-1)))
print(np.median(userdatedf.users.head(-1)))
1540.4545454545455
1641.0
In [140]:
mean_users_perdate=round(np.mean(userdatedf.users),2)
print(mean_users_perdate)
print(np.median(userdatedf.users))
1473.74
1592.0
In [141]:
quarterpercentile_userenrollment= np.percentile(userstartdatedf.users,25)
quarterpercentile_userenrollment
Out[141]:
483.0
In [158]:
color = sorteduserdatedf['users'].apply(lambda x: 'springgreen' if x > mean_users_perdate else 'red')
color2 = sortedstart['users'].apply(lambda x: 'palegreen' if x > quarterpercentile_userenrollment else '#F778A1')

plt.figure(figsize=(25,15))
#23 event dates vs 17 enrollment dates, so the two series span different y-ranges
eventusers=plt.hlines(y=np.arange(0,23),xmin=0,xmax=sorteduserdatedf['users'],color=color,linewidth=13)
startusers=plt.hlines(y=np.arange(6,23),xmin=0,xmax=sortedstart['users'],color=color2,linewidth=13)
plt.plot(sorteduserdatedf['users'], np.arange(
    0,23), "*",markersize=29, markeredgewidth=.5, markeredgecolor='gold', markerfacecolor='#FFD801')
plt.plot(sortedstart['users'], np.arange(
    6,23), "8",markersize=18, markeredgewidth=.4, markeredgecolor='yellow', markerfacecolor='#FFE87C')
plt.yticks(np.arange(0,23), sorteduserdatedf['date'],fontsize=15)
plt.xticks(fontsize=15)

plt.ylabel("Date",fontsize=17)
plt.xlabel("Users",fontsize=17)

legend_dict = {'above average user events': 'springgreen', 'top 75% enrollment': 'palegreen',
               'below average user events': 'red', 'below 25th percentile user enrollment': '#F778A1'}

useractlegend = []
for key in legend_dict:
    useractlegend.append(mpatches.Patch(color=legend_dict[key], label=key))

plt.legend(handles=useractlegend,numpoints=4,fontsize='x-large',title='User Activity Legend',title_fontsize='x-large')

plt.title("Users By Start and Event Date",fontsize=20)

plt.show();

Activity by Date Summary

Two weeks after the launch date, the number of participating users had roughly doubled, and that elevated level held for the four days leading up to Christmas.

For the entire two weeks leading up to Christmas, the number of active users exceeded the mean. It dropped below the mean again on Christmas day and kept falling, to virtually nothing by the end of the month.

User participation exceeded its mean on 14 days; all but two of those were start dates whose enrollment exceeded the 25th percentile of daily enrollment for A/B test participants. On three dates when enrollment fell below that 25th percentile, participation dropped below the mean. The launch date is the only date on which user enrollment exceeded user participation.

Enrollment stopped on the 23rd, while participation remained strong that day and the next, Christmas Eve. Starting Christmas day, participation plummeted.

The primary unusual aspect of this dataset is that the A/B test was launched in December to assess website conversion generally, rather than anything related to Christmas itself. Consumer behavior in retail is significantly shaped by holiday shopping, weather and, in 2020, the pandemic's impact on product supply. (In December 2020, many retailers, such as Lego.com, were out of nearly their entire stock and were not accepting orders for items not in stock.) These December factors need to be taken into account when interpreting this dataset, as with any retail data from the period.

Conversion

In [159]:
#portion of users who performed each of the actions. 
#sorted by users who did each action at least once
print("Users who performed each action, sorted by users who did each action at least once.")
eachaction=eurtestevent.groupby(['event_name'])['user_id'].nunique().sort_values(ascending=False)
eachaction.columns=['times']
eachaction   
Users who performed each action, sorted by users who did each action at least once.
Out[159]:
event_name
login           10849
product_page     7166
purchase         3713
product_cart     3568
Name: user_id, dtype: int64
In [160]:
print("These are the ratios of users who did each action, conversion, sorted by frequency.")
userratios=eachaction/eurtestevent.user_id.nunique()
userratios=userratios.to_frame()
userratios=userratios.reset_index()

userratios=userratios.style.format({
    'user_id': '{:,.2f}'.format
})
userratios
These are the ratios of users who did each action, conversion, sorted by frequency.
Out[160]:
event_name user_id
0 login 1.00
1 product_page 0.66
2 purchase 0.34
3 product_cart 0.33
In [166]:
atleastonce=eurtestevent.groupby(['user_id','event_name'])['event_dt'].count().reset_index()
print("This is a table of how many times each user did this event.")
atleastonce.columns=['user_id','event','times']
atleastonceframe=pd.DataFrame(atleastonce)
atleastonceframe.head()
This is a table of how many times each user did this event.
Out[166]:
user_id event times
0 0002CE61FF2C4011 login 4
1 0002CE61FF2C4011 product_cart 4
2 0002CE61FF2C4011 product_page 4
3 001064FEAAB631A1 login 3
4 001064FEAAB631A1 product_page 3
In [167]:
#summing only the numeric 'times' column (passing ['times','event'] is deprecated, and 'event' is non-numeric)
totalevents_byuser=atleastonceframe.groupby('user_id')[['times']].sum().reset_index()
totalevents_byuser.head()
Out[167]:
user_id times
0 0002CE61FF2C4011 12
1 001064FEAAB631A1 6
2 001E72F50D1C48FA 6
3 002412F1EB3F6E38 6
4 002540BE89C930FB 9
In [80]:
totalevents_byuser.describe()
Out[80]:
times
count 10850.000000
mean 7.550876
std 4.266909
min 1.000000
25% 4.000000
50% 6.000000
75% 10.000000
max 32.000000
In [168]:
purchasedf=atleastonce[atleastonce.event=='purchase']
purchasedf.head()
Out[168]:
user_id event times
13 0031F1B5E9FBF708 purchase 2
21 004C58ADE7CA8C4A purchase 2
25 0050F43F34C955F4 purchase 1
28 0053DD654C9513D6 purchase 3
43 0082295A41A867B5 purchase 5
In [82]:
purchasedf.describe()
Out[82]:
times
count 3713.000000
mean 3.254781
std 1.340775
min 1.000000
25% 2.000000
50% 3.000000
75% 4.000000
max 9.000000
In [169]:
#value_counts is a method, so it needs to be called
timesvalues=purchasedf.times.value_counts()
In [170]:
x_values = pd.Series(range(0,len(purchasedf)))

# random state for reproducibility
np.random.seed(500)

# areas and colors
N = 3713
rn = 2 * np.random.rand(N)
theta = np.pi * np.random.rand(N)
area = 150 * rn**2
colors = theta

sns.set_style('darkgrid')
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111, projection='polar')
c = ax.scatter(x_values, purchasedf['times'], c=colors, s=area, cmap='cool', alpha=0.75)
plt.title("Purchase events per User")
Out[170]:
Text(0.5, 1.05, 'Purchase events per User')
In [171]:
x_values = pd.Series(range(0,len(totalevents_byuser)))

# random state for reproducibility
np.random.seed(500)

# areas and colors
N = 10850
rn = 2 * np.random.rand(N)
theta = np.pi * np.random.rand(N)
area = 150 * rn**2
colors = theta

sns.set_style('darkgrid')
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111, projection='polar')
c = ax.scatter(x_values, totalevents_byuser['times'], c=colors, s=area, cmap='rainbow_r', alpha=0.75)
plt.title("Total events per User");
In [172]:
print("These are the events done by users who each did more than one event.")
multipletimes=atleastonce[atleastonce.times>=2].groupby(['event'])['user_id'].nunique().sort_values(ascending=False)
multipletimes
These are the events done by users who each did more than one event.
Out[172]:
event
login           9877
product_page    6504
purchase        3358
product_cart    3229
Name: user_id, dtype: int64
In [87]:
print("These are the ratios of users who did more than one event")
print("So out of 34% who made an initial purchase, 91.7% returned to make another purchase.")
round(multipletimes/eachaction,3)
These are the ratios of users who did more than one event
So out of 34% who made an initial purchase, 91.7% returned to make another purchase.
Out[87]:
event
login           0.923
product_page    0.921
purchase        0.917
product_cart    0.919
Name: user_id, dtype: float64

Conversion Summary

Users who came to the site did so repeatedly.

Of the two-thirds of users who visited the product page, 92% came back to view a product page again.

Of the one-third who made a purchase, 92% returned to make another purchase. The same percentages hold for the product cart as for purchases.

Each activity can stand on its own, because no single action is required in order to perform another. That said, twice the share of users visit product pages as make purchases, and there is no doubt that viewing a product page builds the knowledge and desire that lead a user to fill a cart or make a purchase.
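
The repeat-use ratios above come from dividing users with at least two occurrences of an event by users with at least one. A compact sketch of that computation on a toy event log:

```python
import pandas as pd

# toy event log: one row per (user, event) occurrence
log = pd.DataFrame({'user_id': ['u1', 'u1', 'u2', 'u3', 'u3', 'u3'],
                    'event_name': ['purchase', 'purchase', 'purchase',
                                   'login', 'login', 'purchase']})

counts = log.groupby(['user_id', 'event_name']).size()
at_least_once = counts.groupby('event_name').size()       # users who did the event at all
repeaters = counts[counts >= 2].groupby('event_name').size()  # users who did it 2+ times
repeat_ratio = (repeaters / at_least_once).fillna(0)
print(repeat_ratio['purchase'])
```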

In [173]:
usersbyeventpop=eurtestevent.groupby('event_name').agg({'user_id':'nunique'}).sort_values('user_id',ascending=False).reset_index()
usersbyeventpop.columns=['events','users']
In [174]:
usersbyeventpop['conversion']=round(usersbyeventpop['users']/(eurtestevent['user_id'].nunique()),2)
usersbyeventpop
Out[174]:
events users conversion
0 login 10849 1.00
1 product_page 7166 0.66
2 purchase 3713 0.34
3 product_cart 3568 0.33
In [175]:
re_sort_index=[0, 2, 1, 3]
In [176]:
#creating a funnel
funnel=eurtestevent.groupby(['event_name'])['user_id'].nunique().reset_index()
In [177]:
#the funnel page-flow order (login → product_page → product_cart → purchase) is applied via reindex below
funnel_sorted=funnel.copy()
In [178]:
funnel_sorted=funnel_sorted.reindex(re_sort_index)
In [94]:
funnel_sorted
Out[94]:
event_name user_id
0 login 10849
2 product_page 7166
1 product_cart 3568
3 purchase 3713
In [179]:
funnel_sorted['percent_change']=funnel_sorted['user_id'].pct_change()
funnel_sorted
Out[179]:
event_name user_id percent_change
0 login 10849 NaN
2 product_page 7166 -0.339478
1 product_cart 3568 -0.502093
3 purchase 3713 0.040639
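
pct_change above measures each step against the previous one; conversion relative to the first step is also informative. A sketch using the funnel counts from the table above:

```python
import pandas as pd

# funnel counts in page-flow order, taken from the sorted funnel above
funnel = pd.Series({'login': 10849, 'product_page': 7166,
                    'product_cart': 3568, 'purchase': 3713})

step_change = funnel.pct_change()     # change vs the previous step
vs_first = funnel / funnel.iloc[0]    # conversion vs the first step (login)
print(round(vs_first['purchase'], 2))  # → 0.34
```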
In [180]:
group_a=eurtestevent[eurtestevent.group=="A"].groupby(['event_name','group'])['user_id'].nunique().reset_index(
    ).sort_values(by='user_id',ascending=False)
group_a_sorted=group_a.reindex(re_sort_index)
eventlist_a=group_a_sorted.event_name
user_a_list=group_a_sorted.user_id
In [181]:
group_b=eurtestevent[eurtestevent.group=="B"].groupby(['event_name','group'])['user_id'].nunique().reset_index(
    ).sort_values(by='user_id',ascending=False)
group_b_sorted=group_b.reindex(re_sort_index)
eventlist_b=group_b_sorted.event_name
user_b_list=group_b_sorted.user_id
In [182]:
funfig= go.Figure()
funfig.add_trace(go.Funnel(
    name='A',
    y=eventlist_a,
    x=user_a_list,
    text='A',
    textposition='inside',
    textinfo='text+value+percent initial+percent total',
    marker = {"color": ['lightcyan','skyblue','dodgerblue','steelblue']}))

funfig.add_trace(go.Funnel(
    name='B',
    y=eventlist_b,
    x=user_b_list,
    textposition='inside',
    text='B',
    textinfo='text+value+percent initial+percent total',
    marker = {"color": ['sandybrown','burlywood','rosybrown','peru']}))

funfig.update_layout(title="Interactive Funnel for Both Experiment Groups")

Funnel Precis

In each testing group, roughly two-thirds of users who logged in went on to visit a product page. In group B, about a quarter of users visited the product cart and about a quarter made a purchase, with the remainder doing neither; in group A, the share of users who did each of those events is likewise approximately a quarter to a third.

In [183]:
#How many users in each group do each event
pivot= eurtestevent.pivot_table(
    index='event_name', values='user_id', columns='group', aggfunc=lambda x: x.nunique()).reset_index()
pivot
Out[183]:
group event_name A B
0 login 5466 5383
1 product_cart 1756 1812
2 product_page 3639 3527
3 purchase 1931 1782
In [184]:
pivot=pivot.reindex([0, 2, 1, 3])
pivot
Out[184]:
group event_name A B
0 login 5466 5383
2 product_page 3639 3527
1 product_cart 1756 1812
3 purchase 1931 1782
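
The purchase counts in this pivot are exactly the inputs a two-proportion z-test needs. A hedged sketch of that test, using the pivot's purchase and login counts as an illustration (the later statistical analysis may use a different formulation):

```python
import numpy as np
from scipy.stats import norm

def two_proportion_ztest(success1, n1, success2, n2):
    """Two-sided z-test for the difference between two proportions, pooled SE."""
    p1, p2 = success1 / n1, success2 / n2
    p_pool = (success1 + success2) / (n1 + n2)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - norm.cdf(abs(z)))

# purchasers vs. logged-in users per group, from the pivot above
z, p_value = two_proportion_ztest(1931, 5466, 1782, 5383)
print(f"z={z:.2f}, p={p_value:.4f}")
```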
In [101]:
eurtestevent.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 81927 entries, 0 to 81926
Data columns (total 6 columns):
user_id       81927 non-null object
group         81927 non-null object
event_dt      81927 non-null datetime64[ns]
event_name    81927 non-null object
details       12085 non-null float64
first_date    81927 non-null datetime64[ns]
dtypes: datetime64[ns](2), float64(1), object(3)
memory usage: 4.4+ MB
In [185]:
#note: this counts events per (event_date, start_date) pair, not unique users
usersbothdate=eurtestevent.groupby([eurtestevent.event_dt.dt.date,'first_date'])['user_id'].count()
usersbothdate=usersbothdate.to_frame().reset_index()
usersbothdate.columns=['event_date','start_date','users']
usersbothdate.event_date=usersbothdate.event_date.astype('datetime64[ns]')
usersbothdate['life']=(usersbothdate.event_date-usersbothdate.start_date)
usersbothdate['life_num'] = usersbothdate['life'].dt.days.astype('int16')
In [103]:
usersbothdate.head(2)
Out[103]:
event_date start_date users life life_num
0 2020-12-07 2020-12-07 2068 0 days 0
1 2020-12-08 2020-12-07 1019 1 days 1
In [187]:
lifepivot= usersbothdate.pivot_table(index='life',
    values='users',aggfunc={'users':[np.sum,np.mean]}).reset_index()
lifepivot.columns=['life','avg_users','sum_users']
lifepivot.avg_users=round(lifepivot.avg_users,1)
In [189]:
lifepivot.head(15)
Out[189]:
life avg_users sum_users life_num
0 0 days 1480.0 25160.0 0.0
1 1 days 742.9 12629.0 1.0
2 2 days 510.4 8167.0 2.0
3 3 days 369.3 5909.0 3.0
4 4 days 283.8 4541.0 4.0
5 5 days 245.6 3930.0 5.0
6 6 days 211.4 3383.0 6.0
7 7 days 182.9 2743.0 7.0
8 8 days 173.4 2427.0 8.0
9 9 days 148.8 1935.0 9.0
10 10 days 132.1 1585.0 10.0
11 11 days 114.6 1261.0 11.0
12 12 days 106.7 1067.0 12.0
13 13 days 93.9 845.0 13.0
14 14 days 105.8 846.0 14.0
In [106]:
lifepivot[lifepivot.life>'14 days'].sum_users.sum()
Out[106]:
2843.0
In [190]:
lifepivot['life_num']=lifepivot.life/np.timedelta64(1,'D')
lifepivot.life_num.values
Out[190]:
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23.])
In [191]:
lifecolorscale=["springgreen", "yellow","orange","red"]
reverse=lifepivot.sort_values(('life_num'), ascending=False)

fig = px.scatter(lifepivot, x="life_num", y="sum_users", color='life_num', color_continuous_scale=lifecolorscale,
                 hover_data=[
                     'life_num','sum_users','avg_users'],title="Users by days of Life: Hover for details")

fig.update_traces(marker=dict(size=(reverse.life_num.values+15), line=dict(width=2,
                                        color='lemonchiffon')), selector=dict(mode='markers'))

fig.update_layout(coloraxis_colorbar=dict(
    title="Life in days"),xaxis=dict(title='Life in Days'),yaxis=dict(title='Sum of Users'))

fig.show()

User Participation Since Enrollment

Activity peaks on the day of enrollment and declines steadily as time passes from each user's start date.

The fewest users return to the site on the days furthest from their enrollment.
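The decay in the table above can be summarized as a retention-style share: the fraction of day-0 activity that remains on each later day. This minimal sketch uses the sum_users values for days 0 through 7, copied from the lifepivot output above:

```python
# sum_users for life days 0-7, copied from the lifepivot table above
sum_users = [25160, 12629, 8167, 5909, 4541, 3930, 3383, 2743]

day0 = sum_users[0]
# share of day-0 activity remaining on each subsequent day
retention = [round(u / day0, 3) for u in sum_users]
for day, share in enumerate(retention):
    print(f"day {day}: {share:.1%} of day-0 activity")
```

By day 1 activity has already fallen to roughly half of the day-0 level, consistent with the steep drop visible in the scatter plot.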

In [192]:
pd.DataFrame(lifepivot).head()
Out[192]:
life avg_users sum_users life_num
0 0 days 1480.0 25160.0 0.0
1 1 days 742.9 12629.0 1.0
2 2 days 510.4 8167.0 2.0
3 3 days 369.3 5909.0 3.0
4 4 days 283.8 4541.0 4.0
In [193]:
lifepivot.life_num= lifepivot.life/np.timedelta64(1,'D')
In [194]:
eventsbydate=eurtestevent.groupby([eurtestevent.event_dt.dt.date,'event_name'])['user_id'].count()
eventsbydate=eventsbydate.to_frame().reset_index()
eventsbydate.columns=['event_date','event','users']
eventsbydate.event_date = pd.to_datetime(eventsbydate.event_date)
eventsbydate.event_date = eventsbydate.event_date.dt.date
eventsbydate.sample(3)
Out[194]:
event_date event users
24 2020-12-13 login 1690
22 2020-12-12 product_page 971
60 2020-12-22 login 2298
In [113]:
eventsbydate.head()
Out[113]:
event_date event users
0 2020-12-07 login 903
1 2020-12-07 product_cart 270
2 2020-12-07 product_page 586
3 2020-12-07 purchase 309
4 2020-12-08 login 996
In [195]:
sortedeventsbydate=eventsbydate.sort_values(by='event_date',ascending=False)
In [196]:
colorbyevent={'login':'cyan','product_page':'springgreen','product_cart':'orange','purchase':'magenta'}
fig = px.scatter(eventsbydate, x="users", y="event_date", size='users',color='event',color_discrete_map=colorbyevent,
                 category_orders={"event": ["login", "product_page", "product_cart", "purchase"]},
                 hover_data=['event','event_date'],title="Events by Date: Hover for details")

fig.update_traces(marker=dict(line=dict(width=2,
                                        color='lemonchiffon')),
                  selector=dict(mode='markers'))
fig.show()

Events by Date Precis

Logins consistently outnumber every other activity, with product page visits next in line. Activity of all types is strongest from five days after the launch until Christmas Eve, after which it dwindles.

The greatest increases over that strong period come in the two most popular activities, logins and product page visits. Purchases almost consistently exceed product cart activity.
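A quick arithmetic check of the purchase-versus-cart observation, using the 2020-12-07 counts from the eventsbydate output above:

```python
# counts copied from the 2020-12-07 rows of eventsbydate above
purchases, carts = 309, 270

# purchases exceeding carts gives a ratio above 1
ratio = purchases / carts
print(f"purchase/cart ratio on 2020-12-07: {ratio:.2f}")
```

Even on the launch day, purchase events outnumber product cart events by about 14%.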

In [116]:
eurtestevent.first_date.value_counts()
Out[116]:
2020-12-14    7924
2020-12-07    7869
2020-12-21    7580
2020-12-12    5854
2020-12-13    5846
2020-12-20    5132
2020-12-08    4794
2020-12-19    4242
2020-12-17    4186
2020-12-18    4176
2020-12-15    4166
2020-12-10    4149
2020-12-22    3684
2020-12-11    3555
2020-12-09    3407
2020-12-16    2930
2020-12-23    2433
Name: first_date, dtype: int64
In [197]:
#How many users in each group do each event
pivot= eurtestevent.pivot_table(
    index='event_name', values='user_id', columns='group', aggfunc=lambda x: x.nunique()).reset_index()
pivot=pivot.reindex([0, 2, 1, 3])
pivot
Out[197]:
group event_name A B
0 login 5466 5383
2 product_page 3639 3527
1 product_cart 1756 1812
3 purchase 1931 1782

Exploratory Data Analysis Summary

Exploratory analysis made clear that the events are neither strictly sequential nor required steps: there can be more users who make purchases than who fill product carts. Although logging in and viewing product pages are not required steps, both activities are frequent enough that they can and likely do influence the likelihood of cart and purchase activity.

Group A users performed every event except product_cart slightly more often, on average, than group B users. Even so, the totals for the two groups were roughly equivalent in every category except purchasing, where group A recorded about 108% of group B's purchase events.

Although most users made four or fewer purchases and spent less than \$20, the distributions of both purchase frequency and purchase amount skew to the right, reflecting an active top quartile with 4 to 9 purchases each and a top-spending quartile that spends more than \$110 each and as much as \$1,115.

The timing of the data collection around Christmas strongly shapes events. Activity in all categories tapers off after Christmas Eve, and purchase events fall to less than half their level during the ten days leading up to Christmas.
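As a numeric check of the group comparison above, the A-to-B ratios can be computed directly from the pivot counts (values copied from Out[197]):

```python
# unique-user counts per event and group, copied from the pivot above
counts = {
    'login':        (5466, 5383),
    'product_page': (3639, 3527),
    'product_cart': (1756, 1812),
    'purchase':     (1931, 1782),
}

# ratio above 1 means group A had more unique users for that event
for event, (a, b) in counts.items():
    print(f"{event}: A/B = {a / b:.3f}")
```

product_cart is the only event where B exceeds A, and purchase shows the largest gap in A's favor, about 1.08, matching the 108% figure above.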

Hypothesis Testing

Primary question for investigation: is there a statistically significant difference in conversion between the test groups, where conversion is the proportion of all users in the test who perform each event?

First comes AA testing to ensure the experiment is statistically fair. Ideally, the AA test shows no difference in conversion: a detected difference would point to a problem with the sample, while no difference indicates the test was implemented correctly.

In [118]:
df_a_shuffle = df_a.sample(frac = 1) 
In [119]:
half_df_a=df_a_shuffle.shape[0]/2
half_df_a=int(half_df_a)
half_df_a
Out[119]:
20785
In [120]:
def makenew(df, rows) :
    top = df.head(rows)
    bottom = df.tail(len(df)-rows)
    return top, bottom
In [121]:
# Split dataframe into top and bottom
top_a_shuffle, bottom_a_shuffle = makenew(df_a_shuffle, half_df_a)
In [122]:
pivot_a= top_a_shuffle.pivot_table(
    index='event_name', values='user_id', columns='group', aggfunc=lambda x: x.nunique()).reset_index()
pivot_a=pivot_a.reindex(re_sort_index)
pivot_a
Out[122]:
group event_name A
0 login 4628
2 product_page 3080
1 product_cart 1484
3 purchase 1645
In [123]:
pivot_a2= bottom_a_shuffle.pivot_table(
    index='event_name', values='user_id', columns='group', aggfunc=lambda x: x.nunique()).reset_index()
pivot_a2=pivot_a2.reindex(re_sort_index)
pivot_a2
Out[123]:
group event_name A
0 login 4624
2 product_page 3060
1 product_cart 1509
3 purchase 1665
In [124]:
def check_hypothesis_aa(group1,group2,event,alpha=0.05):
    successes1=pivot_a[pivot_a.event_name==event][group1].iloc[0]
    successes2=pivot_a2[pivot_a2.event_name==event][group2].iloc[0]
   
    trials1 = top_a_shuffle[top_a_shuffle.group==group1]['user_id'].nunique()
    trials2 = bottom_a_shuffle[bottom_a_shuffle.group==group2]['user_id'].nunique()

    #success proportion in the first group:
    p1= successes1/trials1

    #success proportion in the second group:
    p2= successes2/trials2

    #success proportion in the combined dataset:
    p_combined = (successes1 + successes2) / (trials1 + trials2)

    #the difference between the datasets' proportions
    difference = p1 - p2

    #calculating the statistic in standard deviations of the standard normal distribution
    z_value = difference / math.sqrt(p_combined * (1-p_combined) * (1/trials1 + 1/trials2))

    #setting up the standard normal distribution (mean 0, standard deviation 1)
    distr = st.norm(0,1)

    p_value = (1-distr.cdf(abs(z_value))) * 2
    
    print(
    "H0: There is no statistical difference in conversion for",event,
    "\nH1: There is a statistically significant proportional difference in conversion rate for",event,'\n')
    
    print(f'p_value: {p_value:.6f}')
    print("This test for Event:",event)

    if (p_value < alpha):
        print("Rejecting the null hypothesis: there is a significant difference between the proportions.")
    else:
        print("Failed to reject the null hypothesis: there is no reason to consider the proportions different." )
In [125]:
check_hypothesis_aa('A', 'A', 'product_cart',alpha=0.05)
H0: There is no statistical difference in conversion for product_cart 
H1: There is a statistically significant proportional difference in conversion rate for product_cart 

p_value: 0.601388
This test for Event: product_cart
Failed to reject the null hypothesis: there is no reason to consider the proportions different.
In [126]:
check_hypothesis_aa('A', 'A', 'product_page',alpha=0.05)
H0: There is no statistical difference in conversion for product_page 
H1: There is a statistically significant proportional difference in conversion rate for product_page 

p_value: 0.665286
This test for Event: product_page
Failed to reject the null hypothesis: there is no reason to consider the proportions different.
In [127]:
check_hypothesis_aa('A', 'A', 'purchase',alpha=0.05)
H0: There is no statistical difference in conversion for purchase 
H1: There is a statistically significant proportional difference in conversion rate for purchase 

p_value: 0.688707
This test for Event: purchase
Failed to reject the null hypothesis: there is no reason to consider the proportions different.
In [128]:
check_hypothesis_aa('A', 'A', 'login',alpha=0.05)
H0: There is no statistical difference in conversion for login 
H1: There is a statistically significant proportional difference in conversion rate for login 

p_value: 0.837733
This test for Event: login
Failed to reject the null hypothesis: there is no reason to consider the proportions different.

Results of AA Testing

AA testing shows no difference in conversion. The experiment is statistically fair.
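As a side note on why the AA check matters, the sketch below repeats the random split many times to show how small the chance difference between two halves of the same group tends to be. The counts are illustrative placeholders, not the notebook's data:

```python
import random

# ILLUSTRATIVE outcomes: roughly an 8% "success" rate over ~41.5k events,
# similar in scale to group A's event log; not the actual notebook data
random.seed(42)
outcomes = [1] * 3310 + [0] * 38260

diffs = []
for _ in range(200):
    # re-split the same population at random and compare the two halves
    random.shuffle(outcomes)
    half = len(outcomes) // 2
    top, bottom = outcomes[:half], outcomes[half:]
    diffs.append(abs(sum(top) / len(top) - sum(bottom) / len(bottom)))

print(f"max chance difference across 200 splits: {max(diffs):.4f}")
```

The chance differences stay tiny at this sample size, which is why the large AA p-values above are the expected outcome for a fair split.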

AB Hypothesis Testing

In [129]:
def check_hypothesis(group1,group2,event,alpha=0.05):
    successes1=pivot[pivot.event_name==event][group1].iloc[0]
    successes2=pivot[pivot.event_name==event][group2].iloc[0]
   
    trials1 = eurtestevent[eurtestevent.group==group1]['user_id'].nunique()
    trials2 = eurtestevent[eurtestevent.group==group2]['user_id'].nunique()

    #success proportion in the first group:
    p1= successes1/trials1

    #success proportion in the second group:
    p2= successes2/trials2

    #success proportion in the combined dataset:
    p_combined = (successes1 + successes2) / (trials1 + trials2)

    #the difference between the datasets' proportions
    difference = p1 - p2

    #calculating the statistic in standard deviations of the standard normal distribution
    z_value = difference / math.sqrt(p_combined * (1-p_combined) * (1/trials1 + 1/trials2))

    #setting up the standard normal distribution (mean 0, standard deviation 1)
    distr = st.norm(0,1)

    p_value = (1-distr.cdf(abs(z_value))) * 2
    
    print(
    "H0: There is no statistical difference in conversion for",event,
    "\nH1: There is a statistically significant proportional difference in conversion rate for",event,'\n')
    
    print(f'p_value: {p_value:.6f}')
    print("This test for Event:",event)

    if (p_value < alpha):
        print("Rejecting the null hypothesis: there is a significant difference between the proportions.")
    else:
        print("Failed to reject the null hypothesis: there is no reason to consider the proportions different." )
In [130]:
check_hypothesis('A', 'B', 'product_cart',alpha=0.05)
H0: There is no statistical difference in conversion for product_cart 
H1: There is a statistically significant proportional difference in conversion rate for product_cart 

p_value: 0.087469
This test for Event: product_cart
Failed to reject the null hypothesis: there is no reason to consider the proportions different.
In [131]:
check_hypothesis('A', 'B', 'product_page',alpha=0.05)
H0: There is no statistical difference in conversion for product_page 
H1: There is a statistically significant proportional difference in conversion rate for product_page 

p_value: 0.251842
This test for Event: product_page
Failed to reject the null hypothesis: there is no reason to consider the proportions different.
In [132]:
check_hypothesis('A', 'B', 'purchase',alpha=0.05)
H0: There is no statistical difference in conversion for purchase 
H1: There is a statistically significant proportional difference in conversion rate for purchase 

p_value: 0.014959
This test for Event: purchase
Rejecting the null hypothesis: there is a significant difference between the proportions.
In [133]:
check_hypothesis('A', 'B', 'login',alpha=0.05)
H0: There is no statistical difference in conversion for login 
H1: There is a statistically significant proportional difference in conversion rate for login 

p_value: 0.321035
This test for Event: login
Failed to reject the null hypothesis: there is no reason to consider the proportions different.

Results of AB Testing

AB testing shows no difference in conversion for three of the four events.

For the most important event, purchase, AB testing shows a statistically significant difference in proportions.
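The purchase comparison can be cross-checked with a standard-library-only version of the same two-proportion z-test used in check_hypothesis. The success counts below come from the pivot above; the per-group trial totals are hypothetical placeholders, since the actual unique-user counts per group are computed earlier in the notebook:

```python
import math

def two_proportion_ztest(successes1, trials1, successes2, trials2):
    """Two-sided two-proportion z-test, mirroring check_hypothesis above,
    but using math.erf for the normal CDF instead of scipy."""
    p1 = successes1 / trials1
    p2 = successes2 / trials2
    p_combined = (successes1 + successes2) / (trials1 + trials2)
    se = math.sqrt(p_combined * (1 - p_combined) * (1 / trials1 + 1 / trials2))
    z = (p1 - p2) / se
    # standard normal CDF via the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)

# purchase successes from the pivot above; trial counts are HYPOTHETICAL
# stand-ins for the per-group unique-user totals
z, p_value = two_proportion_ztest(1931, 5800, 1782, 5700)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

With totals of this rough size, the ~2-percentage-point gap in purchase conversion clears the 0.05 threshold, consistent with the rejection above.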

In Conclusion

All told, the A/B test was a success. Users who enrolled returned to the site and repeated their actions often enough to provide the activity records needed for a robust test of their conversion.

A notable feature of this site is the independence of actions: no single action is contingent on another. Even so, the profusion of logins and product page views likely contributes to the roughly one third of users with product cart activity and the one third who make purchases.

The hypothesis testing found a significant difference between the test groups in conversion for purchasing activity. The A/B test helped the company isolate the impact of the change tested and home in on this one critical activity. That leaves a fruitful line of research ahead: clarifying how best to leverage the test group's experience to boost purchases across the overall user base.

In [ ]: